60 research outputs found

    Computing Optimal Experimental Designs via Interior Point Method

    Full text link
    In this paper, we study optimal experimental design problems with a broad class of smooth convex optimality criteria, including the classical A-, D- and p th mean criterion. In particular, we propose an interior point (IP) method for them and establish its global convergence. Furthermore, by exploiting the structure of the Hessian matrix of the aforementioned optimality criteria, we derive an explicit formula for computing its rank. Using this result, we then show that the Newton direction arising in the IP method can be computed efficiently via Sherman-Morrison-Woodbury formula when the size of the moment matrix is small relative to the sample size. Finally, we compare our IP method with the widely used multiplicative algorithm introduced by Silvey et al. [29]. The computational results show that the IP method generally outperforms the multiplicative algorithm both in speed and solution quality

    Global convergence of splitting methods for nonconvex composite optimization

    Full text link
    We consider the problem of minimizing the sum of a smooth function hh with a bounded Hessian, and a nonsmooth function. We assume that the latter function is a composition of a proper closed function PP and a surjective linear map M\cal M, with the proximal mappings of τP\tau P, τ>0\tau > 0, simple to compute. This problem is nonconvex in general and encompasses many important applications in engineering and machine learning. In this paper, we examined two types of splitting methods for solving this nonconvex optimization problem: alternating direction method of multipliers and proximal gradient algorithm. For the direct adaptation of the alternating direction method of multipliers, we show that, if the penalty parameter is chosen sufficiently large and the sequence generated has a cluster point, then it gives a stationary point of the nonconvex problem. We also establish convergence of the whole sequence under an additional assumption that the functions hh and PP are semi-algebraic. Furthermore, we give simple sufficient conditions to guarantee boundedness of the sequence generated. These conditions can be satisfied for a wide range of applications including the least squares problem with the 1/2\ell_{1/2} regularization. Finally, when M\cal M is the identity so that the proximal gradient algorithm can be efficiently applied, we show that any cluster point is stationary under a slightly more flexible constant step-size rule than what is known in the literature for a nonconvex hh.Comment: To appear in SIOP

    Calculus of the exponent of Kurdyka-{\L}ojasiewicz inequality and its applications to linear convergence of first-order methods

    Full text link
    In this paper, we study the Kurdyka-{\L}ojasiewicz (KL) exponent, an important quantity for analyzing the convergence rate of first-order methods. Specifically, we develop various calculus rules to deduce the KL exponent of new (possibly nonconvex and nonsmooth) functions formed from functions with known KL exponents. In addition, we show that the well-studied Luo-Tseng error bound together with a mild assumption on the separation of stationary values implies that the KL exponent is 12\frac12. The Luo-Tseng error bound is known to hold for a large class of concrete structured optimization problems, and thus we deduce the KL exponent of a large class of functions whose exponents were previously unknown. Building upon this and the calculus rules, we are then able to show that for many convex or nonconvex optimization models for applications such as sparse recovery, their objective function's KL exponent is 12\frac12. This includes the least squares problem with smoothly clipped absolute deviation (SCAD) regularization or minimax concave penalty (MCP) regularization and the logistic regression problem with 1\ell_1 regularization. Since many existing local convergence rate analysis for first-order methods in the nonconvex scenario relies on the KL exponent, our results enable us to obtain explicit convergence rate for various first-order methods when they are applied to a large variety of practical optimization models. Finally, we further illustrate how our results can be applied to establishing local linear convergence of the proximal gradient algorithm and the inertial proximal algorithm with constant step-sizes for some specific models that arise in sparse recovery.Comment: The paper is accepted for publication in Foundations of Computational Mathematics: https://link.springer.com/article/10.1007/s10208-017-9366-8. In this update, we fill in more details to the proof of Theorem 4.1 concerning the nonemptiness of the projection onto the set of stationary point

    A Non-monotone Alternating Updating Method for A Class of Matrix Factorization Problems

    Full text link
    In this paper we consider a general matrix factorization model which covers a large class of existing models with many applications in areas such as machine learning and imaging sciences. To solve this possibly nonconvex, nonsmooth and non-Lipschitz problem, we develop a non-monotone alternating updating method based on a potential function. Our method essentially updates two blocks of variables in turn by inexactly minimizing this potential function, and updates another auxiliary block of variables using an explicit formula. The special structure of our potential function allows us to take advantage of efficient computational strategies for non-negative matrix factorization to perform the alternating minimization over the two blocks of variables. A suitable line search criterion is also incorporated to improve the numerical performance. Under some mild conditions, we show that the line search criterion is well defined, and establish that the sequence generated is bounded and any cluster point of the sequence is a stationary point. Finally, we conduct some numerical experiments using real datasets to compare our method with some existing efficient methods for non-negative matrix factorization and matrix completion. The numerical results show that our method can outperform these methods for these specific applications

    Peaceman-Rachford splitting for a class of nonconvex optimization problems

    Full text link
    We study the applicability of the Peaceman-Rachford (PR) splitting method for solving nonconvex optimization problems. When applied to minimizing the sum of a strongly convex Lipschitz differentiable function and a proper closed function, we show that if the strongly convex function has a large enough strong convexity modulus and the step-size parameter is chosen below a threshold that is computable, then any cluster point of the sequence generated, if exists, will give a stationary point of the optimization problem. We also give sufficient conditions guaranteeing boundedness of the sequence generated. We then discuss one way to split the objective so that the proposed method can be suitably applied to solving optimization problems with a coercive objective that is the sum of a (not necessarily strongly) convex Lipschitz differentiable function and a proper closed function; this setting covers a large class of nonconvex feasibility problems and constrained least squares problems. Finally, we illustrate the proposed algorithm numerically
    corecore